Information Discovery based on Multi-granularity Text Fusion

Authors

Qiaoyi HUANG

Yi WEI

Abstract

In this paper we introduce Multi-granularity Text Fusion (MGTF), a new algorithm for information discovery on the Web. Granularity refers to the length of news-related web documents, such as news web pages, blogs, and microblogs, which come from web users. The longer the text, the higher its granularity. Given a topic query on the Internet and the resulting time-stamped web documents of different granularities that contain the query keywords, the task of MGTF is to return, in order, those documents of different granularities that discuss the same topic. The analysis of multi-granularity web documents, which integrates time, content, reprint, and link information, uncovers previously unknown information and opinions that are potentially valuable, held by a minority, or contentious. Experiments show that MGTF achieves the best overall performance with high effectiveness and robustness.
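The abstract does not spell out how MGTF measures granularity or orders its results, so the following is only a minimal sketch of the task setup under stated assumptions, not the authors' algorithm. The `WebDocument` type, the word-count thresholds, the substring keyword match, and the ordering by timestamp and then granularity are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class WebDocument:
    """Hypothetical record for a time-stamped web document."""
    url: str
    text: str
    timestamp: datetime


def granularity(doc, short_max=40, medium_max=300):
    """Map text length (in words) to a coarse granularity level.

    The thresholds are assumptions for this sketch; the abstract only
    says that longer text means higher granularity.
    """
    n_words = len(doc.text.split())
    if n_words <= short_max:
        return 0  # roughly microblog-sized
    if n_words <= medium_max:
        return 1  # roughly blog-sized
    return 2      # roughly news-page-sized


def retrieve(query, docs):
    """Return documents containing every query keyword, in order.

    Matching is a case-insensitive substring test. Results are sorted
    by timestamp, then by granularity (an assumed ordering).
    """
    keywords = query.lower().split()
    hits = [d for d in docs if all(k in d.text.lower() for k in keywords)]
    return sorted(hits, key=lambda d: (d.timestamp, granularity(d)))
```

A real system would also need the content, reprint, and link signals that the abstract mentions; this sketch covers only the keyword filter, the length-based granularity level, and the time ordering.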

Similar resources

User Interests Modeling Based on Multi-source Personal Information Fusion and Semantic Reasoning

User interests are usually distributed in different systems on the Web. Traditional user interest modeling methods are not designed for integrating and analyzing interests from multiple sources, hence, they are not very effective for obtaining comparatively complete description of user interests in the distributed environment. In addition, previous studies concentrate on the text level analysis...

A fusion approach for managing multi-granularity linguistic term sets in decision making

The aim of this paper is to present a fusion approach of multi-granularity linguistic information for managing information assessed in different linguistic term sets (multi-granularity linguistic term sets) together with its application in a decision making problem with multiple information sources, assuming that the linguistic performance values given to the alternatives by the different sources...

A new model for persian multi-part words edition based on statistical machine translation

Multi-part words in English are hyphenated, with the hyphen separating the different parts. Persian has multi-part words as well. According to Persian morphology, a half-space character is needed to separate the parts of multi-part words, but in many cases people incorrectly use a space character instead. This common incorrect use of the space leads to some s...

Short Text Hashing Improved by Integrating Multi-granularity Topics and Tags

Due to computational and storage efficiencies of compact binary codes, hashing has been widely used for large-scale similarity search. Unfortunately, many existing hashing methods based on observed keyword features are not effective for short texts due to the sparseness and shortness. Recently, some researchers try to utilize latent topics of certain granularity to preserve semantic similarity ...

Fusion of Thermal Infrared and Visible Images Based on Multi-scale Transform and Sparse Representation

Because visible and thermal infrared images differ, combining these two types of images is essential for better understanding the characteristics of targets and the environment. Thermal infrared images are most important for distinguishing targets from the background based on radiation differences, and they work well in all-weather and day/night conditions also in land s...

Publication date: 2013
